Bayesian Analysis in Data Cubes

Authors

  • Ruibin Xi
  • Youngjin Kim
  • Nan Lin
  • Yixin Chen
Abstract

In the last few years, there has been active research on aggregating advanced statistical measures in multidimensional data cubes [10] from partitioned subsets of data. In this paper, we propose a compression and aggregation scheme to support Bayesian estimations in data cubes based on the asymptotic properties of Bayesian statistics. The main application of the technique developed in this paper is data warehousing and the associated on-line analytical processing (OLAP) computing. OLAP allows for interactive analysis of multidimensional data to facilitate effective data mining at multiple levels of abstraction. Earlier work in data cubes [10] supports aggregation of simple measures such as sum() and average(). However, the fast development of OLAP technology has led to high demand for more sophisticated data analysis capabilities, such as prediction, trend monitoring, and exception detection of multidimensional data. Simple measures such as sum() and average() are often insufficient, and support for more sophisticated statistical models in OLAP is desired. Recently, some researchers developed aggregation schemes for more advanced statistical analysis, including parametric models such as linear regression [6, 11], general multiple linear regression [5, 14], logistic regression analysis [23], and predictive filters [5], as well as nonparametric statistical models such as naive Bayesian classifiers [4] and linear discriminant analysis [15]. Along this line, we develop an aggregation scheme to support Bayesian estimations in data cubes. Bayesian methods are statistical approaches to parameter estimation and statistical inference that use prior distributions over parameters. Bayesian methods have been successfully applied in many different fields, such as business, computer science, economics, epidemiology, genetics, imaging, and political science. The premise of Bayesian statistics is to incorporate prior knowledge, along with a given set of
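To make the idea of aggregating measures from partitioned subsets concrete, here is a minimal Python sketch. It is not the scheme proposed in the paper, whose details are not given in this abstract. The first part shows how average() can be recovered exactly from per-partition (count, sum) summaries. The second part is an illustrative assumption: for a normal mean with known variance and a flat prior, each partition's posterior is normal, and a precision-weighted combination reproduces the full-data posterior. All names (`CellSummary`, `summarize`, `combine_normal_posteriors`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellSummary:
    """Compressed summary of one data partition (hypothetical name)."""
    count: int
    total: float

    def merge(self, other: "CellSummary") -> "CellSummary":
        # Counts and sums add across disjoint partitions.
        return CellSummary(self.count + other.count, self.total + other.total)

    @property
    def mean(self) -> float:
        return self.total / self.count


def summarize(values):
    """Compress a partition into its (count, sum) summary."""
    return CellSummary(len(values), float(sum(values)))


def combine_normal_posteriors(parts):
    """Precision-weighted combination of (mean, variance) pairs.

    Assumption for illustration only: each pair is a normal posterior
    obtained under a flat prior, so precisions add across partitions.
    """
    precision = sum(1.0 / var for _, var in parts)
    mean = sum(m / var for m, var in parts) / precision
    return mean, 1.0 / precision


# Two disjoint partitions of one cube cell (made-up illustrative values).
part_a = [1.0, 2.0, 3.0]
part_b = [4.0, 5.0, 6.0, 7.0]

# average() over the union, computed from the summaries alone.
merged = summarize(part_a).merge(summarize(part_b))

# Normal mean, known variance sigma2, flat prior:
# the posterior for a partition is N(sample mean, sigma2 / n).
sigma2 = 1.0
posteriors = [
    (summarize(p).mean, sigma2 / len(p)) for p in (part_a, part_b)
]
post_mean, post_var = combine_normal_posteriors(posteriors)
```

In this special case the combined posterior mean equals the pooled sample mean and the combined variance equals sigma2 divided by the total count, so nothing is lost by storing only the per-partition summaries. For general models such an agreement would hold only approximately.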


Similar resources

Online Bayesian Analysis

In the last few years, there has been active research on aggregating advanced statistical measures in multidimensional data cubes from partitioned subsets of data. In this paper, we propose an online compression and aggregation scheme to support Bayesian estimations in data cubes based on the asymptotic properties of Bayesian statistics. In the proposed approach, we compress each data segment b...


SELFI: an object-based, Bayesian method for faint emission line source detection in MUSE deep field data cubes

We present SELFI, the Source Emission Line FInder, a new Bayesian method optimized for detection of faint galaxies in Multi Unit Spectroscopic Explorer (MUSE) deep fields. MUSE is the new panoramic integral field spectrograph at the Very Large Telescope (VLT) that has unique capabilities for spectroscopic investigation of the deep sky. It has provided data cubes with 324 million voxels over a s...


Bayesian and Iterative Maximum Likelihood Estimation of the Coefficients in Logistic Regression Analysis with Linked Data

This paper considers logistic regression analysis with linked data. It is shown that, in logistic regression analysis with linked data, a finite mixture of Bernoulli distributions can be used for modeling the response variables. We proposed an iterative maximum likelihood estimator for the regression coefficients that takes the matching probabilities into account. Next, the Bayesian counterpart...


Bayesian Analysis of Survival Data with Spatial Correlation

In practice, mortality data on living units are often correlated because of the locations of the observations in the study. One of the most important issues in the analysis of survival data with spatial dependence is the estimation of the parameters and the prediction of unknown values at known sites based on the observation vector. In this paper, to analyze this type of survival data, Cox...




Publication date: 2009